Exploiting heterogeneous scientific literature networks to combat ranking bias: Evidence from the computational linguistics area

Authors

  • Xiaorui Jiang
  • Xiaoping Sun
  • Zhe Yang
  • Hai Zhuge
  • Jianmin Yao
Abstract

It is important to help researchers find valuable papers from a large literature collection. To this end, many graph-based ranking algorithms have been proposed. However, most of these algorithms suffer from the problem of ranking bias. Ranking bias hurts the usefulness of a ranking algorithm because it returns a ranking list with an undesirable time distribution. This paper is a focused study on how to alleviate ranking bias by leveraging the heterogeneous network structure of the literature collection. We propose a new graph-based ranking algorithm, MutualRank, that integrates mutual reinforcement relationships among networks of papers, researchers, and venues to achieve a more synthetic, accurate, and less-biased ranking than previous methods. MutualRank provides a unified model that involves both intra- and inter-network information for ranking papers, researchers, and venues simultaneously. We use the ACL Anthology Network as the benchmark data set and construct the gold standard from computer linguistics course websites of well-known universities and two well-known textbooks. The experimental results show that MutualRank greatly outperforms the state-of-the-art competitors, including PageRank, HITS, CoRank, Future Rank, and P-Rank, in ranking papers, both improving ranking effectiveness and alleviating ranking bias. Rankings of researchers and venues by MutualRank are also quite reasonable.

Related resources

Ranking Scientific Articles by Exploiting Citations, Authors, Journals, and Time Information

Ranking scientific articles is an important but challenging task, partly due to the dynamic nature of the evolving publication network. In this paper, we mainly focus on two problems: (1) how to rank articles in the heterogeneous network; and (2) how to use time information in the dynamic network in order to obtain a better ranking result. To tackle the problems, we propose a graph-based ranking...

A Corpus-Based Evaluation of Centering and Pronoun Resolution

In this paper we compare pronoun resolution algorithms and introduce a centering algorithm (Left-Right Centering) that adheres to the constraints and rules of centering theory and is an alternative to Brennan, Friedman, and Pollard’s (1987) algorithm. We then use the Left-Right Centering algorithm to see if two psycholinguistic claims on Cf-list ranking will actually improve pronoun resolution accura...

Learning to Rank Answers to Non-Factoid Questions from Web Collections

This work investigates the use of linguistically motivated features to improve search, in particular for ranking answers to non-factoid questions. We show that it is possible to exploit existing large collections of question–answer pairs (from online social Question Answering sites) to extract such features and train ranking models which combine them effectively. We investigate a wide range of ...

Framing Bias in the Interpretation of Quality Improvement Data: Evidence From an Experiment

Background: A growing body of public management literature sheds light on potential shortcomings of quality improvement (QI) and performance management efforts. These challenges stem from heuristics individuals use when interpreting data. Evidence from studies of citizens suggests that individuals’ evaluation of data is influenced by the linguistic framing or context of that information an...

Ranking-based readability assessment for early primary children's literature

Determining the reading level of children’s literature is an important task for providing educators and parents with an appropriate reading trajectory through a curriculum. Automating this process has been a challenge addressed before in the computational linguistics literature, with most studies attempting to predict the particular grade level of a text. However, guided reading levels develope...

Journal:
  • JASIST

Volume 67  Issue 

Pages  -

Publication date 2016